Handwritten character recognition is an active research topic. Converting a handwritten page into a text-searchable document with Optical Character Recognition (OCR) makes its content accessible without reading the original manuscript. While OCR is well established for English, high-quality OCR applications for Bengali remain scarce, so combining machine learning and deep learning with OCR could be a substantial contribution to the field. Various researchers have proposed strategies for recognizing Bengali handwritten characters using ML algorithms and deep neural networks, but explanations of their models are rarely provided. In our work, we apply several machine learning algorithms and a CNN to recognize handwritten Bengali digits. Some ML models achieve acceptable accuracy, and the CNN achieves high testing accuracy. We apply Grad-CAM as an explainable AI (XAI) method to the CNN, which provides insight into the model and reveals the image regions it relies on when recognizing a digit.
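For readers unfamiliar with the XAI step, the following is a minimal Grad-CAM sketch in PyTorch. The abstract does not publish code, so the backbone, the hooked layer, and the input shape are illustrative assumptions, not the authors' digit CNN.

```python
# Minimal Grad-CAM sketch (illustrative; the model and hooked layer are
# stand-ins for the paper's CNN, which is not published with the abstract).
import torch
import torch.nn.functional as F
from torchvision import models

model = models.resnet18(weights=None)  # placeholder backbone for the digit CNN
model.eval()

feats, grads = {}, {}
layer = model.layer4  # assumed last convolutional block

def fwd_hook(m, inp, out): feats["a"] = out.detach()
def bwd_hook(m, gin, gout): grads["a"] = gout[0].detach()

layer.register_forward_hook(fwd_hook)
layer.register_full_backward_hook(bwd_hook)

x = torch.randn(1, 3, 224, 224)   # placeholder image tensor
score = model(x)[0].max()         # score of the predicted class
score.backward()

# Grad-CAM: weight each feature map by its average gradient, then ReLU.
w = grads["a"].mean(dim=(2, 3), keepdim=True)
cam = F.relu((w * feats["a"]).sum(dim=1, keepdim=True))
cam = F.interpolate(cam, size=x.shape[2:], mode="bilinear", align_corners=False)
cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)  # normalize to [0, 1]
```

The resulting `cam` heatmap highlights which pixels drove the classification, which is the "origin of interest" the abstract refers to.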
Jamdani is the strikingly patterned textile heritage of Bangladesh. The exclusive geometric motifs woven into the fabric are the most attractive part of this craftsmanship and have had a remarkable influence on textile and fine art. In this paper, we develop a technique based on the Generative Adversarial Network that learns to generate entirely new Jamdani patterns from a collection of Jamdani motifs we assembled; the newly formed motifs mimic the appearance of the original designs. Users can input the skeleton of a desired pattern as rough strokes, and our system completes the input by generating a full motif that follows the geometric structure of real Jamdani designs. To serve this purpose, we collected and preprocessed a dataset containing a large number of Jamdani motif images from authentic sources via fieldwork and applied a state-of-the-art method called pix2pix to it. To the best of our knowledge, this is currently the only available dataset of Jamdani motifs in digital format for computer vision research. Our experimental results with the pix2pix model on this dataset show satisfactory computer-generated images of Jamdani motifs, and we believe our work will open a new avenue for further research.
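As context for the stroke-to-motif translation, here is a hedged sketch of the pix2pix objective applied to paired data (skeleton input, complete motif). `G` and `D` are assumed to be the usual U-Net generator and conditional PatchGAN discriminator; this illustrates the loss, not the authors' exact code.

```python
# Sketch of the pix2pix training objective (illustrative assumptions:
# G is a U-Net generator, D a conditional PatchGAN discriminator).
import torch
import torch.nn as nn

bce, l1 = nn.BCEWithLogitsLoss(), nn.L1Loss()
LAMBDA = 100.0  # L1 weight from the original pix2pix paper

def generator_loss(G, D, x, y):
    fake = G(x)                          # x: stroke skeleton, y: real motif
    pred = D(torch.cat([x, fake], 1))    # conditional critic sees input + output
    adv = bce(pred, torch.ones_like(pred))
    return adv + LAMBDA * l1(fake, y), fake  # adversarial realism + pixel fidelity

def discriminator_loss(G, D, x, y):
    with torch.no_grad():
        fake = G(x)
    real_pred = D(torch.cat([x, y], 1))
    fake_pred = D(torch.cat([x, fake], 1))
    return 0.5 * (bce(real_pred, torch.ones_like(real_pred))
                  + bce(fake_pred, torch.zeros_like(fake_pred)))
```

The L1 term keeps the generated motif geometrically close to a real design, while the PatchGAN term pushes local texture toward realistic weave patterns.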
The Internet of Things (IoT) is an emerging concept directly linking billions of physical items, or "things", to the Internet, all collecting and exchanging information between devices and systems. However, IoT devices are often not designed with security in mind, which can lead to security vulnerabilities in multi-device systems. Traditionally, IoT issues have been investigated by surveying IoT developers and experts, but this technique does not scale because surveying all IoT developers is infeasible. Another way to study IoT issues is to examine IoT developer discussions on major online development forums such as Stack Overflow (SO). However, finding discussions related to IoT issues is challenging, since they are often not tagged with IoT-related terms. In this paper, we present the "IoT Security Dataset", a domain-specific dataset of 7,147 samples focused exclusively on IoT security discussions. Since no automated tool exists to label these samples, we labeled them manually. We further employed multiple transformer models to automatically detect security discussions. Through rigorous investigation, we found that IoT security discussions are different from and more complex than traditional security discussions. We demonstrated a substantial performance loss (up to 44%) of transformer models on the cross-domain dataset when transferring knowledge from a general-purpose dataset, "Opiner". We therefore built a domain-specific IoT security detector with an F1-score of 0.69. We have made the dataset public in the hope that developers can learn more about security discussions and that vendors will strengthen their concern for product security.
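A hedged sketch of how such a binary "security discussion" detector can be fine-tuned with Hugging Face Transformers follows. The checkpoint, hyperparameters, and toy sample are illustrative assumptions, not the paper's exact configuration.

```python
# Illustrative fine-tuning sketch for a security-discussion classifier
# (model choice and hyperparameters are assumptions, not the paper's setup).
import torch
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)  # 0 = non-security, 1 = security

class SODataset(torch.utils.data.Dataset):
    def __init__(self, texts, labels):
        self.enc = tok(texts, truncation=True, padding=True, max_length=256)
        self.labels = labels
    def __len__(self): return len(self.labels)
    def __getitem__(self, i):
        item = {k: torch.tensor(v[i]) for k, v in self.enc.items()}
        item["labels"] = torch.tensor(self.labels[i])
        return item

# toy example standing in for the labeled SO discussions
train_ds = SODataset(["device firmware exposes an open telnet port"], [1])
trainer = Trainer(model=model,
                  args=TrainingArguments(output_dir="iot-sec-detector",
                                         num_train_epochs=3),
                  train_dataset=train_ds)
trainer.train()
```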
Generative models have been very successful over the years and have received significant attention for synthetic data generation. As deep learning models grow more complex, they require large amounts of data to perform accurately. In medical image analysis, such generative models play a crucial role because the available data is limited by challenges related to data privacy, lack of data diversity, and uneven data distributions. In this paper, we present a method to generate brain tumor MRI images using generative adversarial networks. We utilize StyleGAN2 with the ADA methodology to generate high-quality brain MRI with tumors while using significantly less training data than existing approaches, and we use three pre-trained models for transfer learning. Results demonstrate that the proposed method can learn the distribution of brain tumors and generate high-quality synthetic brain MRI with tumors, mitigating small-sample-size issues. The approach addresses limited data availability by generating realistic-looking brain MRI with tumors. The code is available at: ~\url{https://github.com/rizwanqureshi123/Brain-Tumor-Synthetic-Data}.
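For orientation, here is a hedged sketch of sampling from a trained StyleGAN2-ADA checkpoint, assuming NVIDIA's stylegan2-ada-pytorch repository is on the Python path; the checkpoint path and batch size are placeholders, and the linked repository above is the authoritative source for the authors' actual pipeline.

```python
# Hedged sketch: sampling synthetic MRI from a trained StyleGAN2-ADA generator.
# Assumes the stylegan2-ada-pytorch repo is importable; "network.pkl" is a
# placeholder for a transfer-learned checkpoint.
import torch
import dnnlib, legacy  # modules from NVIDIA's stylegan2-ada-pytorch repository

with dnnlib.util.open_url("network.pkl") as f:
    G = legacy.load_network_pkl(f)["G_ema"].eval()  # EMA generator weights

z = torch.randn([4, G.z_dim])            # latent codes for 4 synthetic scans
label = torch.zeros([4, G.c_dim])        # unconditional: empty class labels
img = G(z, label)                        # synthesize images in [-1, 1]
img = (img.clamp(-1, 1) + 1) * 127.5     # map to [0, 255] for saving
```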
Floods are among nature's most catastrophic disasters, causing irreversible and massive damage to human life, agriculture, infrastructure, and socio-economic systems. Several studies on flood disaster management and flood forecasting systems have been conducted. Accurately forecasting the onset and progression of floods in real time is challenging: estimating water levels and velocities across large areas requires combining data with computationally demanding flood-propagation models. This paper aims to reduce the extreme risk of this natural disaster and to support policy recommendations by providing flood predictions using different machine learning models. The study uses binary logistic regression, K-Nearest Neighbors (KNN), Support Vector Classifier (SVC), and decision tree classifiers to produce accurate predictions, and a comparative analysis of the results identifies which model achieves better accuracy.
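A minimal scikit-learn sketch of the four-classifier comparison follows; the synthetic features stand in for the study's flood data, and the split and hyperparameters are assumptions.

```python
# Illustrative comparison of the four classifiers named in the abstract
# (synthetic data stands in for the study's rainfall/flood features).
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "KNN": KNeighborsClassifier(),
    "SVC": SVC(),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
}
for name, clf in models.items():
    clf.fit(X_tr, y_tr)
    print(f"{name}: {accuracy_score(y_te, clf.predict(X_te)):.3f}")
```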
Polypharmacy, defined as the use of multiple drugs, is a standard treatment approach, especially for severe and chronic diseases. However, using multiple drugs together may cause interactions between them. A drug-drug interaction (DDI) is a change in a drug's activity that occurs when it is combined with another drug. DDIs may block, increase, or decrease a drug's intended effect or, in the worst case, produce adverse side effects. While timely detection of DDIs is essential, identifying them in clinical trials is time-consuming and expensive owing to the trials' short duration and the many possible drug pairs to be tested. As a result, computational methods are needed to predict DDIs. In this paper, we propose a novel heterogeneous graph attention model, HAN-DDI, to predict drug-drug interactions. We build a drug network with different biological entities, then develop a heterogeneous graph attention network that learns DDIs from the relations of drugs with other entities. It consists of an attention-based heterogeneous graph node encoder that obtains drug node representations and a decoder that predicts drug-drug interactions. Furthermore, we conduct comprehensive experiments to evaluate our model and compare it with state-of-the-art models. Experimental results show that our proposed method, HAN-DDI, significantly outperforms the baselines and accurately predicts DDIs, even for new drugs.
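The following is a minimal sketch in the spirit of the encoder-decoder design, using PyTorch Geometric: one attention encoder per drug-entity relation, averaged into a drug representation, then an MLP decoder scoring drug pairs. The sizes, number of relations, and fusion step are simplifying assumptions, not the authors' exact architecture.

```python
# Minimal attention-based encoder/decoder sketch for DDI prediction
# (illustrative dimensions and relation count; not the HAN-DDI architecture).
import torch
import torch.nn as nn
from torch_geometric.nn import GATConv

class DDIModel(nn.Module):
    def __init__(self, dim=64, num_relations=2):
        super().__init__()
        # one attention encoder per drug-entity relation (e.g., drug-target)
        self.rel_convs = nn.ModuleList(
            [GATConv(dim, dim, heads=1) for _ in range(num_relations)])
        self.decoder = nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU(),
                                     nn.Linear(dim, 1))

    def forward(self, x, edge_indices, pairs):
        # attend over each relation's neighborhood, then average the views
        h = torch.stack([conv(x, ei) for conv, ei in
                         zip(self.rel_convs, edge_indices)]).mean(0)
        u, v = h[pairs[:, 0]], h[pairs[:, 1]]
        return torch.sigmoid(self.decoder(torch.cat([u, v], dim=-1)))

# toy usage: 10 nodes, two relations, score 3 candidate drug pairs
x = torch.randn(10, 64)
edges = [torch.randint(0, 10, (2, 30)) for _ in range(2)]
pairs = torch.tensor([[0, 1], [2, 3], [4, 5]])
print(DDIModel()(x, edges, pairs).shape)  # torch.Size([3, 1])
```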
Drug-drug interactions (DDIs) may hamper the functionality of drugs and, in the worst case, lead to adverse drug reactions (ADRs). Predicting all DDIs is a challenging and critical problem. Most existing computational models integrate drug-centric information from different sources and use it as features in machine learning classifiers to predict DDIs. However, these models have a high chance of failure, especially for new drugs for which not all such information is available. This paper proposes a novel hypergraph neural network (HyGNN) model for the DDI prediction problem, based solely on a drug's SMILES string. To capture drug similarities, we create a hypergraph from the chemical substructures of drugs extracted from their SMILES strings. We then develop HyGNN, consisting of a novel attention-based hypergraph edge encoder that obtains drug representations and a decoder that predicts interactions between drug pairs. Furthermore, we conduct extensive experiments to evaluate our model and compare it with several state-of-the-art methods. Experimental results show that our proposed HyGNN model effectively predicts DDIs and outperforms the baselines with maximum ROC-AUC and PR-AUC of 97.9% and 98.1%, respectively.
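To make the hypergraph construction concrete, here is a hedged sketch using RDKit: drugs become nodes, and shared chemical substructures (here, Morgan fingerprint bits) become hyperedges. The fingerprint radius and size are assumptions standing in for the paper's substructure extraction.

```python
# Hedged sketch of building a drug-substructure hypergraph from SMILES
# (Morgan fingerprint bits approximate the paper's chemical substructures).
import numpy as np
from rdkit import Chem
from rdkit.Chem import AllChem

smiles = ["CCO", "CC(=O)O", "c1ccccc1O"]  # toy drug SMILES strings
fps = []
for s in smiles:
    mol = Chem.MolFromSmiles(s)
    fps.append(AllChem.GetMorganFingerprintAsBitVect(mol, radius=2, nBits=1024))

# incidence matrix H: H[i, j] = 1 if drug i contains substructure (bit) j
H = np.array([list(fp) for fp in fps], dtype=np.float32)
shared = H[:, H.sum(axis=0) > 1]  # keep substructures shared by >1 drug
print(shared.shape)               # (num_drugs, num_hyperedges)
```

Because the construction needs only SMILES strings, the representation extends to new drugs with no interaction history, which is the abstract's key motivation.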
Unsupervised learning-based anomaly detection in latent space has gained importance, since discriminating anomalies from normal data becomes difficult in high-dimensional space. Both density-estimation and distance-based methods for detecting anomalies in latent space have been explored in the past, and they show that retaining valuable properties of the input data in latent space aids better reconstruction of test data. Moreover, real-world sensor data is skewed and non-Gaussian, making mean-based estimators unreliable. Furthermore, anomaly detection methods based on reconstruction error rely on Euclidean distance, which ignores useful correlation information in the feature space and fails to accurately reconstruct data that deviates from the training distribution. In this work, we address these limitations of reconstruction-error-based autoencoders and propose a kernelized autoencoder that leverages a robust form of the Mahalanobis distance (MD) to measure latent-dimension correlation and effectively detect both near and far anomalies. This hybrid loss is aided by the principle of maximizing mutual information gain between the latent space and the high-dimensional prior data space, achieved by maximizing the entropy of the latent space while preserving useful correlation information of the original data in the low-dimensional latent space. The resulting multi-objective function thus has two goals: it measures correlation information in the latent feature space in the form of the robust MD, and it simultaneously preserves useful correlation information from the original data space by maximizing mutual information between the prior and latent spaces.
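As a concrete illustration of the correlation-aware distance term, here is a hedged sketch of a squared Mahalanobis distance over a batch of latent codes; the shrinkage regularizer and the way this term would be weighted into the full multi-objective loss are assumptions, not the paper's robust MD formulation.

```python
# Illustrative Mahalanobis-distance term over latent codes (the shrinkage
# covariance and loss weighting are assumptions, not the paper's exact form).
import torch

def mahalanobis_sq(z, eps=1e-3):
    """Squared MD of each latent code from the batch mean, using a
    shrinkage-regularized batch covariance to capture latent correlations."""
    mu = z.mean(dim=0, keepdim=True)
    zc = z - mu
    cov = zc.T @ zc / (z.shape[0] - 1)
    cov = cov + eps * torch.eye(z.shape[1])  # shrinkage keeps cov invertible
    return torch.einsum("bi,ij,bj->b", zc, torch.linalg.inv(cov), zc)

z = torch.randn(32, 8)              # batch of latent codes from the encoder
loss_md = mahalanobis_sq(z).mean()  # correlation-aware distance term
```

Unlike Euclidean reconstruction error, this term accounts for how latent dimensions co-vary, which is what lets the detector separate both near and far anomalies.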
The Internet of Things (IoT) is a system that connects physical computing devices, sensors, software, and other technologies. Data can be collected, transferred, and exchanged with other devices over the network without requiring human interaction. One challenge facing the development of IoT is the existence of anomalous data in the network, so research on anomaly detection in the IoT environment has become popular and necessary in recent years. This survey provides an overview of the current progress of anomaly detection algorithms and how they can be applied in the context of the Internet of Things. We categorize the widely used machine learning and deep learning anomaly detection techniques in IoT into three types: clustering-based, classification-based, and deep learning-based. For each category, we introduce some state-of-the-art anomaly detection methods and evaluate the advantages and limitations of each technique.
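As a small illustration of the classification-based family the survey covers, the following sketch fits a One-Class SVM with scikit-learn; the synthetic readings are placeholders for IoT sensor traffic.

```python
# One-Class SVM as a representative classification-based anomaly detector
# (synthetic data stands in for IoT sensor readings).
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
normal = rng.normal(0, 1, size=(200, 3))   # simulated normal sensor readings
anomalous = rng.normal(6, 1, size=(5, 3))  # simulated anomalous readings

detector = OneClassSVM(nu=0.05, kernel="rbf").fit(normal)
print(detector.predict(anomalous))         # -1 marks predicted anomalies
```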
Pneumonia, a respiratory infection caused by bacteria or viruses, affects a large number of people, especially in developing and impoverished countries, where high levels of pollution, unclean living conditions, and overcrowding are frequently observed alongside insufficient medical infrastructure. Pneumonia can cause pleural effusion, a condition in which fluid fills the lung and complicates breathing. Early detection of pneumonia is essential for ensuring curative care and boosting survival rates, and chest X-ray imaging is the approach most commonly used to diagnose it. The purpose of this work is to develop a method for the automatic diagnosis of bacterial and viral pneumonia in digital X-ray images. This article first presents the authors' technique and then gives a comprehensive report on recent developments in reliable pneumonia diagnosis. In this study, we tuned state-of-the-art deep convolutional neural networks to classify pneumonia from chest X-ray images and tested their performance, comparing the deep learning architectures empirically: VGG19, ResNet152V2, ResNeXt101, SEResNet152, MobileNetV2, and DenseNet201. The experiment data consists of two groups, sick and healthy chest X-ray images. Because prompt treatment depends on rapid diagnosis, fast identification models are preferred. DenseNet201 showed no overfitting or performance degradation in our experiments, and its accuracy tended to increase with the number of epochs. Further, DenseNet201 achieves state-of-the-art performance with a significantly smaller number of parameters and within a reasonable computing time, outperforming the competition with a testing accuracy of 95%. Each architecture was trained using Keras, with Theano as the backend.
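A hedged sketch of DenseNet201 transfer learning for the two-class chest X-ray task follows, written with tf.keras; the classification head, freezing strategy, and input size are illustrative assumptions rather than the paper's exact configuration.

```python
# Illustrative DenseNet201 transfer-learning setup for sick-vs-healthy
# chest X-rays (head design and freezing strategy are assumptions).
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.applications import DenseNet201

base = DenseNet201(weights="imagenet", include_top=False,
                   input_shape=(224, 224, 3))
base.trainable = False  # freeze the pretrained feature extractor first

model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dropout(0.3),
    layers.Dense(1, activation="sigmoid"),  # sick vs. healthy X-ray
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
model.summary()
```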